Methods for disaster recoverability testing and validation

ABSTRACT

Exemplary methods and a computer recovery readiness evaluation process relate to a virtual recovery testing process for Disaster Recovery Plans (DRPs) that can be executed by technical generalists. By implementing the DRP virtual testing process, a technical generalist can be charged with the tasks of evaluating and validating documented DRP assumptions, plan execution steps, and interoperability dependencies/requirements, in addition to the availability of applications, application specific vaulted vital records, and hardware systems that are referenced within the recovery logic of a DRP. Further, established DRP problem management processes can be used to address anomalies and deficiencies.

BACKGROUND

Exemplary embodiments relate generally to disaster recovery planning, and more particularly, to the virtual testing of disaster recoverability plans.

Disaster recovery is the process of reinstituting access to data, application, and hardware systems that are critical to resuming business operations in the wake of a disaster that has disrupted normal business operations. A Disaster Recovery Plan should include information that not only pertains to the resumption of normal systematic operations, but should also address any sudden or unexpected key personnel losses. Therefore, an effective Disaster Recovery Plan should take into account that the individual that is charged with resuming normal operation may not be a technical specialist in the field he/she is performing recovery operations within. In some instances a business entity may elect not to obtain Disaster Recovery Plan testing or recoverability hardware due to the high cost of acquisition. While the business accepts the risk of longer recovery time objectives associated with this decision, the Disaster Recovery Plans for such testing applications must be executable by available human resources and the vital recovery records must be available for use in the event of a disaster.

BRIEF SUMMARY

Exemplary embodiments include a method for the testing of disaster recoverability framework protocols within a computing system environment. The method comprises initially retrieving a disaster recovery plan, wherein the disaster recovery plan comprises data restoration logic that is associated with at least one data recovery activity. The disaster recovery plan is analyzed in order to identify any backup data content, application, and hardware resources that are comprised within computing systems that are referenced within the disaster recovery plan. The method also comprises identifying a segment of the disaster recovery plan for evaluation, polling any hardware resources that have been identified in order to determine if the hardware resources are available, and determining the requirements that are necessitated for the recovery of the identified application resources.

The method yet further comprises evaluating the backup data content, application, and hardware resources in accordance with an identified segment of the disaster recovery plan. The evaluation of the backup data content, application, and hardware resources is compared to the data restoration logic of the disaster recovery plan, and thereafter error anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources are identified.

Additional exemplary embodiments include a computer recovery readiness evaluation process that includes a computer readable medium useable by a processor, the medium having stored thereon a sequence of instructions which, when executed by the user, tests the disaster recoverability framework protocols of a disaster recovery plan by initially retrieving a disaster recovery plan, wherein the disaster recovery plan comprises data restoration logic that is associated with at least one data recovery activity. The disaster recovery plan is analyzed in order to identify any backup data content, application, and hardware resources that are comprised within computing systems that are referenced within the disaster recovery plan. The computer recovery readiness evaluation process also performs the operation of identifying a segment of the disaster recovery plan for evaluation, polling any hardware resources that have been identified in order to determine if the hardware resources are available, and determining any requirements that are necessitated for the recovery of the identified application resources.

The computer recovery readiness evaluation process yet further performs the operation of evaluating the backup data content, application, and hardware resources in accordance with an identified segment of the disaster recovery plan. The evaluation of the backup data content, application, and hardware resources is compared to the data restoration logic of the disaster recovery plan, and thereafter error anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources are identified.

Other methods and/or computer recovery readiness evaluation processes according to embodiments will be or become apparent to one with skill in the art upon review of the following drawings and detailed description. It is intended that all such additional systems, methods, and/or computer recovery readiness evaluation processes be included within this description, be within the scope of the exemplary embodiments, and be protected by the accompanying claims.

BRIEF DESCRIPTION OF DRAWINGS

Referring now to the drawings wherein like elements are numbered alike in the several FIGURES:

FIG. 1 is a diagram illustrating key components that are to be available for the logical testing of a Disaster Recovery Plan in accordance with exemplary embodiments of the present invention;

FIG. 2 is a flow diagram detailing a methodology for the logical testing of a Disaster Recovery Plan in accordance with exemplary embodiments of the present invention; and

FIG. 3 illustrates an example of a computer having elements that may be used in implementing exemplary embodiments.

The detailed description explains the exemplary embodiments, together with advantages and features, by way of example with reference to the drawings.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

One or more exemplary embodiments are described below in detail. The disclosed embodiments are intended to be illustrative only since numerous modifications and variations therein will be apparent to those of ordinary skill in the art.

Exemplary embodiments provide standardized evaluation processes and criteria that can be utilized to evaluate the recoverability of system applications for which testing hardware is not available. The exemplary embodiments also enable Disaster Recovery Plans (DRPs) to be drafted in such a way as to be executable by technical generalists within a field, technical generalists being personnel who are competent in the general operation of computing systems. As such, a technical generalist can implement exemplary embodiments and accomplish the goals of the presented methodology without the need for assistance from personnel intimately familiar with specific applications.

FIG. 1 is a diagram illustrating components that are available for the logical testing of a DRP in accordance with exemplary embodiments. According to exemplary embodiments, a user of a computer workstation 105 may perform virtual recovery testing (VRT) in the event that testing hardware is not available for testing disaster recovery. The computer workstation 105 may include hardware and software elements for assisting a user of the workstation in conducting virtual recovery testing, e.g., components for presenting a graphical user interface. The computer workstation 105 may also include additional hardware and software elements of the types conventionally included in personal computers, such as an operating system, but these are not shown for purposes of clarity.

According to exemplary embodiments, VRT is the full technical review of any application DRP that cannot be unit tested within a Disaster Recovery (DR) environment due to hardware and/or infrastructure constraints. For example, in some instances, a business enterprise may elect not to procure DR testing or recoverability hardware (e.g., due to the high cost of acquisition for a DR testing application or DR recoverability hardware components, etc.). While the business enterprise may accept the potential risk of incurring longer recovery time objectives that are due to the decision not to procure a DR testing application or DR recoverability hardware, the DR plan for the business enterprise's applications must be executable by a DR specialist and the vital records of the business enterprise must be available for use in the event of a disaster. Therefore, the exemplary embodiments of the VRT are configured to ensure that a DRP is executable and that any required vital records are available for the recovery process.

When conducting a VRT, each section of a DRP may need to be evaluated for content and execution validity from a unit testing perspective by personnel that are proficient in the computing platform operating system and database of a recovery environment but not intimately familiar with the production application itself. In accordance with exemplary embodiments, the VRT process includes the testing of the logic of all recovery activities that are documented within a DRP in addition to determining the availability of the vital records (backup data) that will be required for systematic recovery at an offsite recovery location. Further, all documented DRP assumptions, plan execution steps, interoperability dependencies/requirements, and the availability of application specific vaulted vital records may be evaluated and validated. As a result, any anomalies or deficiencies are documented and reported.
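The kind of review described above can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory representation of plan execution steps; the `PlanStep` structure, its field names, and the sample plan are illustrative assumptions and are not taken from the embodiments. The sketch checks two things a reviewer would otherwise check by hand: that every step's dependencies are completed by an earlier step, and that every vital record a step needs is present in the offsite vault.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """One documented recovery step (hypothetical structure, for illustration only)."""
    step_id: str
    description: str
    depends_on: list[str] = field(default_factory=list)     # steps that must finish first
    vital_records: list[str] = field(default_factory=list)  # backup sets the step needs


def review_steps(steps: list[PlanStep], vaulted: set[str]) -> list[str]:
    """Return readable findings; an empty list means no anomalies were found."""
    findings = []
    completed: set[str] = set()
    for step in steps:
        # A dependency is only satisfied if it appears earlier in the plan.
        for dep in step.depends_on:
            if dep not in completed:
                findings.append(
                    f"{step.step_id}: depends on '{dep}', which is not completed earlier in the plan"
                )
        # Every vital record the step uses must exist in the offsite vault.
        for record in step.vital_records:
            if record not in vaulted:
                findings.append(
                    f"{step.step_id}: vital record '{record}' is not in the offsite vault"
                )
        completed.add(step.step_id)
    return findings


# Sample plan (assumed data): S3 refers to a step S4 that the plan never defines.
plan = [
    PlanStep("S1", "Restore operating system image", vital_records=["os-image"]),
    PlanStep("S2", "Restore database", depends_on=["S1"], vital_records=["db-full-backup"]),
    PlanStep("S3", "Start application", depends_on=["S2", "S4"]),
]
findings = review_steps(plan, vaulted={"os-image"})
for line in findings:
    print(line)
```

For the sample data, the review reports a missing vaulted record for S2 and an unsatisfied dependency for S3, which are the kinds of deficiencies a VRT is meant to document and report.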

As shown in FIG. 1, at least one DRP is stored in a dedicated database storage device 110. A DRP that is targeted for testing may be retrieved via the workstation 105, responsive to input from a user. Further, applications 115, hardware 120, and vital records (backup data) 125 that have been identified and targeted for testing are also illustrated.

FIG. 2 illustrates a method for disaster recovery testing according to an exemplary embodiment. At step 205 a DRP is retrieved from the storage device 110 via a workstation 105. The DRP comprises data restoration logic that is associated with at least one data recovery activity that is to be simulated within the VRT computer recovery readiness evaluation process 105. At step 210 the DRP is analyzed as a function of the VRT computer recovery readiness evaluation process 105 in order to identify the backup data content 125, application 115, and hardware 120 resources comprised within computing systems that are referenced within the disaster recovery plan.
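Steps 205 and 210 can be sketched in Python as follows. This is only one possible arrangement: it assumes that a DRP is stored as a JSON document keyed by a plan identifier and that each recovery activity lists the hardware, applications, and backups it references. The store contents, key names, and function names are hypothetical.

```python
import json

# Hypothetical stand-in for the storage device 110; a real store could be a
# database or a document repository.
DRP_STORE = {
    "payroll-drp": json.dumps({
        "activities": [
            {
                "name": "restore-db",
                "hardware": ["db-server-01"],
                "applications": ["payroll-db"],
                "backups": ["payroll-db-weekly"],
            },
            {
                "name": "restore-app",
                "hardware": ["app-server-01", "db-server-01"],
                "applications": ["payroll-web"],
                "backups": [],
            },
        ]
    })
}


def retrieve_plan(plan_id):
    """Step 205 (sketch): fetch and parse the stored plan."""
    return json.loads(DRP_STORE[plan_id])


def identify_resources(plan):
    """Step 210 (sketch): collect every resource the plan references, by kind."""
    found = {"hardware": set(), "applications": set(), "backups": set()}
    for activity in plan["activities"]:
        for kind in found:
            found[kind].update(activity.get(kind, []))
    # Sorted lists keep the result stable and easy to compare.
    return {kind: sorted(names) for kind, names in found.items()}


resources = identify_resources(retrieve_plan("payroll-drp"))
print(resources)
```

Collecting resources into sets first means a server referenced by several activities, such as `db-server-01` in the sample, is only listed once for the later polling step.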

At step 215, a segment of the DRP is selected for evaluation by a workstation 106 operator, and the selection information is input to the VRT computer recovery readiness evaluation process 105. At step 225 the hardware 120 resources that have been identified within the DRP by the VRT computer recovery readiness evaluation process 105 are polled as a function of the VRT computer recovery readiness evaluation process 105 in order to determine if the hardware 120 resources are available. At step 230 the VRT computer recovery readiness evaluation process 105 determines the requirements that will be necessitated for the recovery of the application 115 resources that have been identified within the DRP by evaluating the documented assumptions, plan execution steps, and interoperability dependencies/requirements of the DRP. Additionally, the availability of backup data content 125 identified within the DRP (the availability of application specific backup data content) is determined and validated by the VRT computer recovery readiness evaluation process 105 at this time. Within further exemplary embodiments the determining of the availability of identified backup data content further comprises evaluating the availability of the backup data content 125 that is associated with the DRP that will be required for a recovery operation that is to be performed at a predetermined remote location.
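The polling and availability checks of steps 225 and 230 might look like the Python sketch below. The embodiments do not specify how hardware is polled, so the TCP connection test used here is an assumption, as are the function names, the default port, and the sample host and backup names. The probe is passed in as a parameter so that the example can run offline with a stand-in.

```python
import socket


def tcp_probe(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    This is one possible reachability check; the port and timeout are assumed values.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def poll_hardware(hosts, probe=tcp_probe):
    """Step 225 (sketch): map each host named in the DRP to True or False."""
    return {host: probe(host) for host in hosts}


def missing_backups(required, vault_catalog):
    """Step 230 (sketch): list required backup sets that the vault catalog lacks."""
    return sorted(set(required) - set(vault_catalog))


# Offline demonstration: a stand-in probe replaces the network check.
online = {"db-server-01"}
status = poll_hardware(["db-server-01", "app-server-01"], probe=lambda h: h in online)
missing = missing_backups(["payroll-db-weekly", "payroll-config"], ["payroll-db-weekly"])
print(status)
print(missing)
```

Injecting the probe keeps the availability logic testable without network access and lets a different check (for example, a management-interface query) be substituted without changing `poll_hardware`.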

At step 235 the backup data content, application, and hardware resources are evaluated by the VRT computer recovery readiness evaluation process 105 in accordance with the identified recovery logic that is included in the segment of the DRP that is being evaluated. Further, all suppositions that are documented within the DRP in addition to recovery operation execution steps that are documented within the DRP are evaluated and validated. Yet further, all the backup data content 125, application 115, and hardware 120 resource interoperability dependencies are evaluated and validated. The analytical evaluation results of the backup data content 125, application 115, and hardware 120 resources are compared to the data restoration logic as detailed within the disaster recovery plan. At step 240 any error anomalies that exist between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content 125, application 115, and hardware 120 resources are documented and reported to the workstation 106 operator. The resulting VRT computer recovery readiness evaluation process 105 DRP analytical evaluation results can be displayed to the workstation 106 operator or delivered to another application for further processing.
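The comparison and reporting of steps 235 and 240 can be sketched in Python under a simplifying assumption: the data restoration logic of a segment is reduced to the state each activity expects of each resource, and the evaluation results are reduced to the state actually observed. The `Anomaly` record, the state labels, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """One mismatch between the plan's expectation and the evaluation result."""
    activity: str
    resource: str
    expected: str
    observed: str


def find_anomalies(expected, observed):
    """Compare expectations {activity: {resource: state}} with observed {resource: state}."""
    anomalies = []
    for activity, needs in expected.items():
        for resource, want in needs.items():
            # A resource with no evaluation result is reported as "unknown".
            got = observed.get(resource, "unknown")
            if got != want:
                anomalies.append(Anomaly(activity, resource, want, got))
    return anomalies


def format_report(anomalies):
    """Render the anomalies as plain text for the operator or another application."""
    if not anomalies:
        return "No anomalies found in the evaluated segment."
    lines = [f"{len(anomalies)} anomaly(ies) found:"]
    lines += [
        f"- {a.activity}: {a.resource} expected '{a.expected}', observed '{a.observed}'"
        for a in anomalies
    ]
    return "\n".join(lines)


# Assumed sample data for one evaluated segment.
expected = {
    "restore-db": {"db-server-01": "available", "payroll-db-weekly": "vaulted"},
    "restore-app": {"app-server-01": "available"},
}
observed = {
    "db-server-01": "available",
    "payroll-db-weekly": "vaulted",
    "app-server-01": "unreachable",
}
anomalies = find_anomalies(expected, observed)
report = format_report(anomalies)
print(report)
```

Keeping the anomalies as structured records, and formatting them only at the end, supports both outcomes mentioned above: display to the operator or delivery to another application for further processing.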

As mentioned above, the VRT process of the exemplary embodiments can be executed by technical generalists. As such, the technical generalist can be charged with the tasks of evaluating and validating documented DRP assumptions, plan execution steps, interoperability dependencies/requirements and the availability of application specific vaulted vital records. Further, established DR problem management processes can be used to address anomalies and deficiencies.

FIG. 3 illustrates an example of a computer 300 having elements that may be used in implementing exemplary embodiments. The computer 300 includes, but is not limited to, PCs, workstations, laptops, PDAs, palm devices, servers, mobile devices, data storage systems, and the like. The computer 300 may include a processor 310, memory 320, and one or more input and/or output (I/O) 370 devices (or peripherals) that are communicatively coupled via a local interface (not shown). The local interface can be, for example but not limited to, one or more buses or other wired or wireless connections, as is known in the art. The local interface may have additional elements, such as controllers, buffers (caches), drivers, repeaters, and receivers, to enable communications. Further, the local interface may include address, control, and/or data connections to enable appropriate communications among the aforementioned components.

According to exemplary embodiments, the processor 310 is a hardware device for executing software that can be stored in the memory 320. The processor 310 can be virtually any custom made or commercially available processor, a central processing unit (CPU), a digital signal processor (DSP), or an auxiliary processor among several processors associated with the computer 300, and the processor 310 may be a semiconductor based microprocessor (in the form of a microchip) or a macroprocessor.

The memory 320 can include any one or combination of volatile memory elements (e.g., random access memory (RAM), such as dynamic random access memory (DRAM), static random access memory (SRAM), etc.) and nonvolatile memory elements (e.g., ROM, erasable programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), tape, compact disc read only memory (CD-ROM), disk, diskette, cartridge, cassette or the like, etc.). Moreover, the memory 320 may incorporate electronic, magnetic, optical, and/or other types of storage media. Note that the memory 320 can have a distributed architecture, where various components are situated remote from one another, but can be accessed by the processor 310.

The software in the memory 320 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions. In the example illustrated in FIG. 3, the software in the memory 320 includes a suitable operating system (O/S) 350, compiler 340, source code 330, and an application 360 of the exemplary embodiments.

The operating system 350 controls the execution of other computer programs, and provides scheduling, input-output control, file and data management, memory management, and communication control and related services. It is contemplated by the inventors that the application 360 for implementing exemplary embodiments is applicable on all other commercially available operating systems.

The application 360 may be a source program, executable program (object code), script, or any other entity comprising a set of instructions to be performed. When a source program is to be executed, then the program is usually translated via a compiler (such as the compiler 340), assembler, interpreter, or the like, which may or may not be included within the memory 320, so as to operate properly in connection with the O/S 350. Furthermore, the application 360 can be written in (a) an object oriented programming language, which has classes of data and methods, or (b) a procedural programming language, which has routines, subroutines, and/or functions, for example but not limited to, C, C++, C#, Pascal, BASIC, API calls, HTML, XHTML, XML, ASP scripts, FORTRAN, COBOL, Perl, Java, ADA, .NET, and the like.

The I/O 370 devices may include input devices such as, for example but not limited to, a mouse, keyboard, scanner, microphone, etc. Furthermore, the I/O 370 devices may also include output devices, for example but not limited to, a printer, display, etc. Also, the I/O 370 devices may further include devices that communicate both inputs and outputs, for instance but not limited to, a NIC or modulator/demodulator (for accessing remote devices, other files, devices, systems, or a network), a radio frequency (RF) or other transceiver, a telephonic interface, a bridge, a router, etc.

When the computer 300 is in operation, the processor 310 is configured to execute software stored within the memory 320, to communicate data to and from the memory 320, and to generally control operations of the computer 300 pursuant to the software. The application 360 and the O/S 350 are read, in whole or in part, by the processor 310, perhaps buffered within the processor 310, and then executed.

When the application 360 is implemented in software, it should be noted that the application 360 can be stored on virtually any computer readable medium for use by or in connection with any computer related system or method. In the context of this document, a computer readable medium may be an electronic, magnetic, optical, or other physical device or means that can contain or store a computer program for use by or in connection with a computer related system or method.

The application 360 can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. In the context of this document, a “computer-readable medium” can be any means that can store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer readable medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.

More specific examples (a nonexhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic) having one or more wires, a portable computer diskette (magnetic or optical), a random access memory (RAM) (electronic), a read-only memory (ROM) (electronic), an erasable programmable read-only memory (EPROM, EEPROM, or Flash memory) (electronic), an optical fiber (optical), and a portable compact disc memory (CDROM, CD R/W) (optical). Note that the computer-readable medium could even be paper or another suitable medium, upon which the program is printed or punched, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

In exemplary embodiments, where the application 360 is implemented in hardware, the application 360 can be implemented with any one or a combination of the following technologies, which are each well known in the art: a discrete logic circuit(s) having logic gates for implementing logic functions upon data signals, an application specific integrated circuit (ASIC) having appropriate combinational logic gates, a programmable gate array(s) (PGA), a field programmable gate array (FPGA), etc.

Further, one or more applications 360 may be configured to implement various operations and processes of exemplary embodiments discussed herein. For example, the application 360 may be configured to implement the VRT computer recovery readiness evaluation process, methods for disaster recovery testing, data restoration logic, etc., in accordance with exemplary embodiments.

As described above, the exemplary embodiments can be in the form of computer-implemented processes and apparatuses for practicing those processes. The exemplary embodiments can also be in the form of computer program code containing instructions embodied in tangible media, such as floppy diskettes, CD ROMs, hard drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the exemplary embodiments. The exemplary embodiments can also be in the form of computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the exemplary embodiments. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.

While the invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed for carrying out this invention, but that the invention will include all embodiments falling within the scope of the claims. Moreover, the use of the terms first, second, etc. do not denote any order or importance, but rather the terms first, second, etc. are used to distinguish one element from another. Furthermore, the use of the terms a, an, etc. do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced item.

1. A method for the testing of disaster recoverability framework protocols within a computing system environment, the method comprising: analyzing a disaster recovery plan including data restoration logic that is associated with at least one data recovery activity in order to identify backup data content, application, and hardware resources comprised within computing systems that are referenced within the disaster recovery plan; identifying a segment of the disaster recovery plan for evaluation; polling the identified hardware resources to determine if the hardware resources are available; determining requirements for the recovery of the identified application resources; evaluating the backup data content, application, and hardware resources in accordance with the identified segment of the disaster recovery plan; comparing the evaluation of the backup data content, application, and hardware resources to the data restoration logic of the disaster recovery plan; and identifying error anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources.
2. The method of claim 1, further comprising determining the availability of the identified backup data content.
3. The method of claim 2, where in response to the identification of recovery operation execution anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources, an error report is generated detailing the occurrence of any error anomaly.
4. The method of claim 3, further comprising evaluating the backup data content, application, and hardware resources in accordance with the data restoration logic of all of the data recovery activities that are comprised within the disaster recovery plan.
5. The method of claim 4, wherein all suppositions that are documented within the disaster recovery plan are evaluated and validated.
6. The method of claim 5, wherein all recovery operation execution steps that are documented with the disaster recovery plan are evaluated and validated.
7. The method of claim 6, wherein all backup data content, application, and hardware resource interoperability dependencies are evaluated and validated.
8. The method of claim 7, wherein the availability of application specific backup data content is determined.
9. The method of claim 4, wherein determining the availability of the identified backup data content further comprises evaluating the availability of the backup data content that is associated with the disaster recovery plan that will be required for a recovery operation that is to be performed at a predetermined remote location.
10. A computer recovery readiness evaluation process that includes a computer readable medium useable by a processor, the medium having stored thereon a sequence of instructions which, when executed by the user, tests the disaster recoverability framework protocols of a disaster recovery plan by: analyzing a disaster recovery plan including data restoration logic that is associated with at least one data recovery activity in order to identify backup data content, application, and hardware resources comprised within computing systems that are referenced within the disaster recovery plan; receiving input identifying a segment of the disaster recovery plan for evaluation; polling the identified hardware resources in order to determine if the hardware resources are available; determining requirements for the recovery of the identified application resources; evaluating the backup data content, application, and hardware resources in accordance with the identified segment of the disaster recovery plan; comparing the evaluation of the backup data content, application, and hardware resources to the data restoration logic of the disaster recovery plan; and identifying error anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources.
11. The computer recovery readiness evaluation process of claim 10, further comprising determining the availability of the identified backup data content.
12. The computer recovery readiness evaluation process of claim 11, where in response to the identification of recovery operation execution anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources an error report is generated detailing the occurrence of any error anomaly.
13. The computer recovery readiness evaluation process of claim 12, further comprising evaluating the backup data content, application, and hardware resources in accordance with the data restoration logic of all of the data recovery activities that are comprised within the disaster recovery plan.
14. The computer recovery readiness evaluation process of claim 13, wherein all suppositions that are documented within the disaster recovery plan are evaluated and validated.
15. The computer recovery readiness evaluation process of claim 14, wherein all recovery operation execution steps that are documented with the disaster recovery plan are evaluated and validated.
16. The computer recovery readiness evaluation process of claim 15 wherein all backup data content, application, and hardware resource interoperability dependencies are evaluated and validated.
17. The computer recovery readiness evaluation process of claim 16, wherein the availability of application specific backup data content is determined.
18. The computer recovery readiness evaluation process of claim 14, wherein the determining of the availability of the identified backup data content further comprises evaluating the availability of the backup data content that is associated with the disaster recovery plan that will be required for a recovery operation that is to be performed at a predetermined remote location.
19. A method for the testing of disaster recoverability framework protocols within a computing system environment, the method comprising: analyzing a disaster recovery plan including data restoration logic that is associated with at least one data recovery activity in order to identify backup data content, application, and hardware resources comprised within computing systems that are referenced within the disaster recovery plan; identifying a segment of the disaster recovery plan for evaluation; polling the identified hardware resources to determine if the hardware resources are available; determining requirements for the recovery of the identified application resources; evaluating the backup data content, application, and hardware resources in accordance with the identified segment of the disaster recovery plan; comparing the evaluation of the backup data content, application, and hardware resources to the data restoration logic of the disaster recovery plan; identifying error anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources; and determining the availability of the identified backup data content.
20. The method of claim 19, where in response to the identification of recovery operation execution anomalies between the data restoration logic of the identified segment of the disaster recovery plan and the evaluation of the backup data content, application, and hardware resources, an error report is generated detailing the occurrence of any error anomaly.